A simple model to study phylogeographies and speciation patterns in space

In this working paper, we present a simple theoretical framework based on network theory to
study how speciation, the process by which new species appear, shapes spatial patterns of diversity.
We show that this framework can be expanded to account for different types of networks and
interactions, and can incorporate different modes of speciation.

I. MOTIVATION

The peculiar spatial relationship between closely re- 
lated species was among the first patterns of diversity 
used to infer evolution. As early as the 1850s, Alfred 
Wallace noted that the closest relatives were often ob- 
served in adjacent yet non-overlapping regions [33, 34]. 
Wagner and Jordan later relied on a similar observation 
to argue for the importance of geography and isolation in 
the formation of new species [3]. Finally, Mayr developed
a theory of allopatric speciation, a cornerstone
of the modern synthesis, again using similar observations 
[22, 23]. The relationship between phylogeny and ge- 
ography has shaped our understanding of the origin of 
species [3, 19]. It is also crucial to the development of a 
unified theory of community assembly [26, 29]. Yet, the- 
ory remains mostly silent about the subject. Few models 
can generate phylogeographies, and none can be used to 
study the effect of complex spatial structures [1]. This is
surprising, not only because of the theoretical importance 
of phylogeography, but also because several phylogenetic 
methods use geography to infer patterns of speciation 
[1, 20, 21]. 

Part of the problem lies in the limitations of traditional 
mathematical methods: analytical solutions to spatially- 
explicit models are often only available for the most triv- 
ial cases [9]. Thus, we are left with no theoretical frame- 
work to study the patterns noted by Wallace, Wagner, 
and Mayr. In this document, we describe a very simple 
algorithm to generate phylogeographies in spatial networks.
Our approach is inspired by metapopulation theory
[13, 14, 17], although the spatio-temporal scale is different:
we are interested in the dynamics of populations
at the regional scale over long periods. The model
will be used to study phylogeographies in various spatial 
contexts and to develop better tools to understand the 
relationship between phylogeny and geography. 

We use the term "phylogeography" in the general
sense: it is the union of phylogenetics with geography.
Our approach emphasizes how spatial patterns of
speciation shape biodiversity. It cannot be used to study
within-species variations, a major focus of phylogeography
[12]. Our usage is thus closer to the field known as
comparative phylogeography.



II. MODELING THE LANDSCAPE 

We model the landscape as a spatial network of com- 
munities. A network is a flexible mathematical object 
defined as a set of vertices V and a set of edges E, which 
are used to connect the vertices [27]. Here, the vertices 
represent communities and the edges denote migration 
[5-8]. Spatial networks are simply networks in which 
vertices are embedded in a known topological space [16], 
in our case a two-dimensional map. Thus, each community
corresponds to a vertex in the network and to a
position on a map. Networks are increasingly common
in ecology as they can be used to model complex struc- 
tures and quantify the effect of clustering, connectivity, 
and isolation [4, 24, 25, 31]. In particular, isolation is 
the most important factor in many speciation events [3],
making networks well-suited to study patterns of speci- 
ation in different contexts [5, 6]. The spatial network 
can be built in two ways. First, random geometric networks
can be generated by randomly placing the vertices
on a surface, normally the unit square, and linking all
pairs of communities within some threshold distance r [28]. This
technique is used to test network algorithms applied to 
maps [30]. Second, a real map can be used as a template 
for a spatial network [4, 24]. This method offers the op- 
portunity to generate predictions specific to a given spa- 
tial structure, and test the predictions of our algorithm 
against empirical data. 
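
The first construction can be illustrated with a minimal sketch. It assumes a uniform placement of the vertices on the unit square and the Euclidean distance; the function names, the parameter names (n, r, seed), and the connectivity check are illustrative choices and are not taken from the original work.

```python
import math
import random
from itertools import combinations


def random_geometric_network(n, r, seed=None):
    """Place n communities uniformly at random on the unit square and
    link every pair whose Euclidean distance is at most r."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if math.dist(positions[u], positions[v]) <= r:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return positions, neighbors


def is_connected(neighbors):
    """Depth-first search from vertex 0; True if every vertex is reached."""
    if not neighbors:
        return True
    seen, stack = {0}, [0]
    while stack:
        for w in neighbors[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(neighbors)
```

A call such as random_geometric_network(50, 0.2, seed=42) returns the map positions and the adjacency sets; is_connected can then be used to reject networks that contain unreachable communities, if a connected landscape is required.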



III. THE MODEL 

A species is divided into populations, which are
distributed in a network of communities. A species is either
present or absent in a community; we do not keep track
of the number of individuals. Occupancy thus follows the




FIG. 1: Top: a phylogeography with four species (yellow, 
blue, green, pink). The populations are distributed in a spatial
network, with each community (circles) hosting populations
from one or more species. Empty communities are white
and a gradient is used for communities with more than one 
species. The communities are connected by migration (thin 
black lines). Bottom: a speciation event. The pink species 
is divided in three groups of populations. Its leftmost group 
undergoes speciation and a connected subgroup now belongs 
to a new species (in red). 



standard colonization/extinction dynamics of metapop- 
ulation theory [13]. For each time step, all populations 
have the opportunity to colonize adjacent communities 
(the vertices connected by an edge in the network). The 
probability of a successful colonization of community x
by species i, written C_ix, depends on {S_x \ i}, the set of
populations present in community x minus i; on δ_ij, the
time since the most recent common ancestor of species i
and j; on C_max, the highest possible colonization rate;
and on H, a positive constant (H > 0). H describes the
decline of the intensity of interactions with phylogenetic
divergence. In short, a higher H makes it difficult for
closely related species to coexist. C_ix is a very simple
function derived from exponential decay. It is based on an
old hypothesis by Darwin: closely related species are more
likely to compete. This hypothesis has recently received
experimental support [15, 32]. A strong assumption of trait
conservatism underlies the model [18]. At each time step,
all populations have the same probability e of extinction.
Speciation occurs in groups of populations. We define a
group as a set of connected populations from the same
species (Fig. 1). Each group has a probability v of
undergoing speciation. When speciation occurs in a group, a
random subset of [1, n] connected populations will
speciate, with n being the number of populations in the
original group (Fig. 1).



IV. VARIATIONS

The basic model can easily be extended to account for
various types of interactions. In this section we briefly
discuss a few extensions.



A. Allopatric speciation

Our model is mostly parapatric: strictly allopatric
speciation occurs only when the entire group speciates,
which happens with probability 1/n, with n being the size
of the group to speciate. An alternative is to always force
allopatric speciation by making the entire group speciate.



B. Sympatric speciation?

With few solid cases of sympatric speciation, and many
of them involving important allopatric/parapatric phases
[2, 3, 11], it is hard to decide how to build a
phenomenological model of sympatric speciation. Furthermore,
the assumption of strong niche conservatism would be hard
to maintain, as niche overlap between diverging populations
is one of the hardest challenges for sympatric speciation.
Nevertheless, if enough sympatric speciation events can
be analyzed, our model could be modified to allow
sympatric, parapatric, and allopatric speciation.



C. Variable H

H is fixed in the original model, but it could vary in 
time and space. For example, smaller regions could have 
higher H to account for a lower carrying capacity. 



D. Variable extinction rates 

The extinction rate could have the same form as 
the colonization rate and be affected by closely related 
species. 

E. Variable v

The speciation rate could decrease with higher diversity
(more niches are filled) or increase ("diversity begets
diversity") [10].

F. Growing food webs 

The basic idea of using spatial networks and groups 
of connected populations for speciation could be used 
to model how complex food webs grow with speciation 
events. This integration would, however, require many 
new assumptions and a more sophisticated model for the 
colonization and extinction rates. 

Integrating food web dynamics leads to some difficulties.
For example, a trophic model would involve very
different species with potentially different rates of dispersal.
The threshold value r used to determine the realized
links in the spatial network would have to be different
for each group of species. For example, group-specific
threshold values could be linked to the niche value (e.g.,
smaller species have lower dispersal ranges). A connected
random geometric network could then be generated with
the lowest threshold value. Because raising the threshold
only adds links, this ensures that the networks of all
groups are connected.
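
The following sketch illustrates the nested-threshold idea under stated assumptions: the vertices are drawn uniformly on the unit square, the positions are redrawn until the network built with the smallest threshold is connected, and one set of links is then built per group on the same positions. The function names, the group labels, and the max_tries parameter are hypothetical.

```python
import math
import random
from itertools import combinations


def edges_within(positions, r):
    """Return the set of vertex pairs (u, v), u < v, at distance <= r."""
    return {
        (u, v)
        for u, v in combinations(range(len(positions)), 2)
        if math.dist(positions[u], positions[v]) <= r
    }


def connected(n, edges):
    """Union-find connectivity check over n vertices."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(a) for a in range(n)}) == 1


def networks_by_group(n, thresholds, seed=None, max_tries=1000):
    """Draw positions until the network built with the smallest threshold
    is connected, then build one edge set per group on those positions."""
    rng = random.Random(seed)
    r_min = min(thresholds.values())
    for _ in range(max_tries):
        positions = [(rng.random(), rng.random()) for _ in range(n)]
        if connected(n, edges_within(positions, r_min)):
            nets = {g: edges_within(positions, r) for g, r in thresholds.items()}
            return positions, nets
    raise RuntimeError("no connected network found; raise the thresholds")
```

For instance, networks_by_group(20, {"small": 0.4, "large": 0.6}, seed=0) returns one edge set per group; the edge set of the group with the lowest threshold is a subset of every other edge set, so all of them are connected.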



G. Positive interactions 

Positive interactions between closely related species are 
also possible, for example in plants. This variation can 
be achieved by making C_ix increase when related species
are present. 



V. IMPLEMENTATION 

An open-source implementation is available on GitHub:
https://github.com/PhDP/wagner
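
Independently of that code base, one time step of the basic model of Section III can be sketched as follows. Several choices here are assumptions made only for illustration: the colonization probability is taken to be C_ix = C_max * prod_j (1 - exp(-δ_ij / H)) over the other species j present in x, which is one form consistent with the verbal description; events are applied in the order colonization, extinction, speciation; the subset size is drawn uniformly in [1, n]; and the connected subset is grown from a random starting population. Species are integer identifiers, and mrca_time stores, for each pair of species, the time step at which they diverged.

```python
import math
import random


def colonization_probability(i, residents, mrca_time, now, c_max, h):
    """ASSUMED form: C_max * prod_j (1 - exp(-delta_ij / H)), j != i."""
    p = c_max
    for j in residents:
        if j != i:
            delta = now - mrca_time[frozenset((i, j))]
            p *= 1.0 - math.exp(-delta / h)
    return p


def groups(neighbors, occupied):
    """Connected components of the subgraph induced by `occupied`."""
    remaining, result = set(occupied), []
    while remaining:
        start = remaining.pop()
        component, stack = {start}, [start]
        while stack:
            for w in neighbors[stack.pop()]:
                if w in remaining:
                    remaining.remove(w)
                    component.add(w)
                    stack.append(w)
        result.append(component)
    return result


def random_connected_subset(neighbors, group, rng):
    """Draw a size k in [1, n], then grow a connected subset of `group`
    from a random starting population (one possible sampling scheme)."""
    k = rng.randint(1, len(group))
    subset = {rng.choice(sorted(group))}
    while len(subset) < k:
        frontier = sorted({w for v in subset for w in neighbors[v]
                           if w in group and w not in subset})
        subset.add(rng.choice(frontier))
    return subset


def step(neighbors, occupancy, mrca_time, now, c_max, h, e, v, rng):
    """One time step (ASSUMED order). `occupancy` maps each species to its
    set of occupied communities and is updated in place."""
    # Colonization: each population tries each adjacent unoccupied community.
    arrivals = []
    for i, sites in occupancy.items():
        for x in sorted(sites):
            for y in sorted(neighbors[x]):
                if y not in sites:
                    residents = [j for j, s in occupancy.items() if y in s]
                    p = colonization_probability(i, residents, mrca_time,
                                                 now, c_max, h)
                    if rng.random() < p:
                        arrivals.append((i, y))
    for i, y in arrivals:
        occupancy[i].add(y)
    # Extinction: every population disappears with probability e.
    for i in occupancy:
        occupancy[i] = {x for x in sorted(occupancy[i]) if rng.random() >= e}
    # Speciation: every group speciates with probability v.
    next_id = max(occupancy) + 1
    for i in list(occupancy):
        for group in groups(neighbors, occupancy[i]):
            if rng.random() < v:
                subset = random_connected_subset(neighbors, group, rng)
                occupancy[i] -= subset
                for j in occupancy:
                    if j != i:  # the new species inherits its parent's history
                        mrca_time[frozenset((next_id, j))] = \
                            mrca_time[frozenset((i, j))]
                mrca_time[frozenset((i, next_id))] = now
                occupancy[next_id] = subset
                next_id += 1
```

Starting from a single species, for example occupancy = {0: {0}} and mrca_time = {}, repeated calls to step with an increasing value of now yield both the spatial distribution of each species and the pairwise divergence times from which a phylogeny can be rebuilt. Always drawing the whole group in random_connected_subset would give the strictly allopatric variant of Section IV A.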